Overview

Dataset statistics

Number of variables 12
Number of observations 1088
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 102.1 KiB
Average record size in memory 96.1 B
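
As a quick consistency check, the average record size should equal the total in-memory size divided by the number of observations. The sketch below assumes 1 KiB = 1024 bytes and uses the rounded values from the table, so it matches the reported figure only to the displayed precision.

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 102.1
n_observations = 1088

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```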

Variable types

Numeric 2
Text 8
Categorical 2

Alerts

atc2_concept_name is highly overall correlated with atc2_concept_code and 2 other fields High correlation
atc2_concept_code is highly overall correlated with atc2_concept_name and 2 other fields High correlation
atc1_concept_name is highly overall correlated with atc1_concept_code High correlation
atc1_concept_code is highly overall correlated with atc1_concept_name High correlation
atc_concept_id has unique values Unique
atc_concept_code has unique values Unique
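
The high-correlation alerts are consistent with a nested hierarchy: in the sample rows of this report, each higher-level code is a prefix of the 7-character atc_concept_code (for example, C09DX04 sits under C09DX, C09D, C09, and C). A minimal sketch of that prefix relationship follows; the function name `atc_levels` is hypothetical, and rows whose parent columns hold `nan` would not follow this pattern.

```python
def atc_levels(code: str) -> dict[str, str]:
    """Split a 7-character ATC code into its four parent-level codes by prefix."""
    if len(code) != 7:
        raise ValueError(f"expected a 7-character ATC code, got {code!r}")
    return {
        "atc1_concept_code": code[:1],  # anatomical main group, e.g. "C"
        "atc2_concept_code": code[:3],  # e.g. "C09"
        "atc3_concept_code": code[:4],  # e.g. "C09D"
        "atc4_concept_code": code[:5],  # e.g. "C09DX"
    }

levels = atc_levels("C09DX04")
print(levels)
```

Because every parent code is determined by the child code, a column at one level fully predicts the columns above it, which is what a high categorical correlation reflects.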

Reproduction

Analysis started 2025-04-28 13:39:13.432034
Analysis finished 2025-04-28 13:39:14.871222
Duration 1.44 seconds
Software version ydata-profiling v4.16.1

Variables

atc_concept_id
Real number (ℝ)

Unique 

Distinct 1088
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 23152967
Minimum 1588648
Maximum 45893497
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 8.6 KiB

Quantile statistics

Minimum 1588648
5-th percentile 21600412
Q1 21601788
median 21603166
Q3 21604380
95-th percentile 40253776
Maximum 45893497
Range 44304849
Interquartile range (IQR) 2592.5

Descriptive statistics

Standard deviation 5631659.2
Coefficient of variation (CV) 0.24323704
Kurtosis 8.65969
Mean 23152967
Median Absolute Deviation (MAD) 1295.5
Skewness 3.045367
Sum 2.5190428 × 10^10
Variance 3.1715585 × 10^13
Monotonicity Strictly increasing
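
Several of the descriptive statistics above are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks these using the rounded values shown in the report, so agreement is only expected to the displayed precision.

```python
# Rounded figures copied from the report for atc_concept_id.
n = 1088
mean = 23152967
std = 5631659.2

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
```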
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
43534863 1 0.1%
43534853 1 0.1%
43534833 1 0.1%
43534827 1 0.1%
43534824 1 0.1%
43534823 1 0.1%
43534822 1 0.1%
43534821 1 0.1%
43534815 1 0.1%
43534814 1 0.1%
Other values (1078) 1078 99.1%

Minimum 10 values

Value Count Frequency (%)
1588648 1 0.1%
1588697 1 0.1%
21600005 1 0.1%
21600008 1 0.1%
21600012 1 0.1%
21600013 1 0.1%
21600019 1 0.1%
21600034 1 0.1%
21600056 1 0.1%
21600082 1 0.1%

Maximum 10 values

Value Count Frequency (%)
45893497 1 0.1%
45893489 1 0.1%
45893488 1 0.1%
45893476 1 0.1%
45893474 1 0.1%
45893464 1 0.1%
45893463 1 0.1%
45893461 1 0.1%
45893458 1 0.1%
45893267 1 0.1%

atc_concept_name
Text

Distinct 1085
Distinct (%) 99.7%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 65
Median length 52
Mean length 20.097426
Min length 5

Characters and Unicode

Total characters 21866
Distinct characters 59
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1082
Unique (%) 99.4%
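
The Distinct and Unique counts above measure different things: Distinct is the number of different values, while Unique is the number of values that occur exactly once. A small sketch with a hypothetical toy column illustrates the distinction, followed by the arithmetic for the figures reported here (1088 rows, 1085 distinct, 1082 unique).

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy column: "b" and "d" are repeated.
toy = ["a", "b", "b", "c", "d", "d", "e"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Figures from the report for this variable.
n_rows, n_distinct, n_unique = 1088, 1085, 1082
repeated_values = n_distinct - n_unique  # values that occur more than once
rows_in_repeats = n_rows - n_unique      # rows holding those repeated values
print(repeated_values, rows_in_repeats)
```

With three repeated values spread over six rows, each repeated name must occur exactly twice.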

Sample

1st row valsartan and sacubitril
2nd row ivacaftor and lumacaftor
3rd row sodium fluoride; oral
4th row stannous fluoride; oral
5th row hydrogen peroxide; oral

Most occurring words

Value Count Frequency (%)
oral 411 17.8%
systemic 203 8.8%
parenteral 146 6.3%
topical 63 2.7%
rectal 43 1.9%
ophthalmic 41 1.8%
and 28 1.2%
acid 21 0.9%
inhalant 15 0.7%
nasal 15 0.7%
Other values (1097) 1317 57.2%
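
The word-frequency table above appears to count lower-cased tokens with punctuation removed (for example, "sodium fluoride; oral" contributes one count to "oral"). The sketch below reproduces that idea on the five sample rows; the tokenization rule (split on whitespace, strip surrounding punctuation) is an assumption, not the profiler's documented behaviour.

```python
import string
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated tokens with surrounding punctuation stripped."""
    counter = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counter[token] += 1
    return counter

# The five sample rows shown for this variable.
sample = [
    "valsartan and sacubitril",
    "ivacaftor and lumacaftor",
    "sodium fluoride; oral",
    "stannous fluoride; oral",
    "hydrogen peroxide; oral",
]
counts = word_counts(sample)
print(counts["oral"], counts["and"], counts["fluoride"])
```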

Most occurring characters

Value Count Frequency (%)
a 2206 10.1%
e 1925 8.8%
i 1843 8.4%
o 1639 7.5%
l 1592 7.3%
r 1592 7.3%
t 1344 6.1%
n 1324 6.1%
(space) 1215 5.6%
s 896 4.1%
Other values (49) 6290 28.8%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 21866 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc_concept_code
Text

Unique 

Distinct 1088
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7616
Distinct characters 29
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1088
Unique (%) 100.0%

Sample

1st row C09DX04
2nd row R07AX30
3rd row A01AA01
4th row A01AA04
5th row A01AB02

Most occurring words

Value Count Frequency (%)
s01gx11 1 0.1%
v03ax03 1 0.1%
n06ax26 1 0.1%
v03ae05 1 0.1%
l04ac11 1 0.1%
l03ab13 1 0.1%
a10bd16 1 0.1%
j05ar14 1 0.1%
l01xe27 1 0.1%
c10ax13 1 0.1%
Other values (1078) 1078 99.1%

Most occurring characters

Value Count Frequency (%)
0 1947 25.6%
A 1020 13.4%
1 843 11.1%
B 451 5.9%
C 378 5.0%
2 361 4.7%
3 312 4.1%
X 248 3.3%
4 230 3.0%
5 222 2.9%
Other values (19) 1604 21.1%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 7616 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

ndrugreports
Real number (ℝ)

Distinct 483
Distinct (%) 44.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 553.87132
Minimum 1
Maximum 14508
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 8.6 KiB

Quantile statistics

Minimum 1
5-th percentile 2
Q1 17
median 74
Q3 343
95-th percentile 2756.25
Maximum 14508
Range 14507
Interquartile range (IQR) 326

Descriptive statistics

Standard deviation 1456.6921
Coefficient of variation (CV) 2.6300189
Kurtosis 32.45299
Mean 553.87132
Median Absolute Deviation (MAD) 69
Skewness 5.093829
Sum 602612
Variance 2121951.8
Monotonicity Not monotonic
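
The quantiles and skewness above point to a heavy right tail. One common rule of thumb (Tukey's 1.5 × IQR fence, which is a convention and not something this report applies) gives a feel for how far the upper quantiles sit from the bulk of the data. The sketch uses only figures shown in the report.

```python
# Quantile figures copied from the report for ndrugreports.
q1, q3 = 17, 343
p95 = 2756.25
maximum = 14508

iqr = q3 - q1                  # interquartile range
upper_fence = q3 + 1.5 * iqr   # Tukey's upper fence (assumed convention)

print(f"IQR = {iqr}, upper fence = {upper_fence}")
print("95th percentile above fence:", p95 > upper_fence)
print("maximum / fence ratio:", round(maximum / upper_fence, 1))
```

Since even the 95th percentile lies above the fence, more than 5% of the values would be flagged under this rule, which suggests a log scale may be more informative than a linear histogram.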
Histogram with fixed size bins (bins=50)

Common Values

Value Count Frequency (%)
1 37 3.4%
3 34 3.1%
2 32 2.9%
9 20 1.8%
10 19 1.7%
5 19 1.7%
11 17 1.6%
19 15 1.4%
6 15 1.4%
7 14 1.3%
Other values (473) 866 79.6%

Minimum 10 values

Value Count Frequency (%)
1 37 3.4%
2 32 2.9%
3 34 3.1%
4 13 1.2%
5 19 1.7%
6 15 1.4%
7 14 1.3%
8 12 1.1%
9 20 1.8%
10 19 1.7%

Maximum 10 values

Value Count Frequency (%)
14508 1 0.1%
13385 1 0.1%
12625 1 0.1%
12320 1 0.1%
12078 1 0.1%
10111 1 0.1%
9756 1 0.1%
9116 1 0.1%
8312 1 0.1%
8212 1 0.1%

atc4_concept_name
Text

Distinct 395
Distinct (%) 36.3%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 92
Median length 53
Mean length 27.363051
Min length 3

Characters and Unicode

Total characters 29771
Distinct characters 60
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 168
Unique (%) 15.4%

Sample

1st row Angiotensin II receptor blockers (ARBs), other combinations
2nd row Other respiratory system products
3rd row Caries prophylactic agents
4th row Caries prophylactic agents
5th row Antiinfectives and antiseptics for local oral treatment

Most occurring words

Value Count Frequency (%)
other 181 5.5%
and 148 4.5%
inhibitors 145 4.4%
derivatives 121 3.7%
agents 93 2.8%
for 79 2.4%
drugs 49 1.5%
analogues 43 1.3%
plain 42 1.3%
selective 42 1.3%
Other values (512) 2349 71.4%

Most occurring characters

Value Count Frequency (%)
i 2906 9.8%
e 2753 9.2%
t 2342 7.9%
n 2305 7.7%
(space) 2204 7.4%
a 2164 7.3%
s 2098 7.0%
o 1889 6.3%
r 1868 6.3%
c 1010 3.4%
Other values (50) 8232 27.7%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 29771 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc4_concept_code
Text

Distinct 418
Distinct (%) 38.4%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 5
Median length 5
Mean length 4.9558824
Min length 3

Characters and Unicode

Total characters 5392
Distinct characters 31
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 183
Unique (%) 16.8%

Sample

1st row C09DX
2nd row R07AX
3rd row A01AA
4th row A01AA
5th row A01AB

Most occurring words

Value Count Frequency (%)
nan 24 2.2%
l01xe 23 2.1%
l01xx 18 1.7%
l04aa 15 1.4%
d07ac 10 0.9%
n03ax 10 0.9%
b01ac 10 0.9%
c09aa 9 0.8%
n06ax 9 0.8%
n06aa 9 0.8%
Other values (408) 951 87.4%

Most occurring characters

Value Count Frequency (%)
0 1025 19.0%
A 995 18.5%
B 441 8.2%
1 412 7.6%
C 372 6.9%
X 233 4.3%
D 200 3.7%
N 175 3.2%
L 155 2.9%
3 154 2.9%
Other values (21) 1230 22.8%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 5392 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc3_concept_name
Text

Distinct 176
Distinct (%) 16.2%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 71
Median length 53
Mean length 28.499081
Min length 3

Characters and Unicode

Total characters 31007
Distinct characters 38
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 34
Unique (%) 3.1%

Sample

1st row ANGIOTENSIN II RECEPTOR BLOCKERS (ARBs), COMBINATIONS
2nd row OTHER RESPIRATORY SYSTEM PRODUCTS
3rd row STOMATOLOGICAL PREPARATIONS
4th row STOMATOLOGICAL PREPARATIONS
5th row STOMATOLOGICAL PREPARATIONS

Most occurring words

Value Count Frequency (%)
and 220 6.1%
agents 217 6.1%
other 205 5.7%
for 151 4.2%
drugs 98 2.7%
use 83 2.3%
products 77 2.1%
preparations 72 2.0%
acting 62 1.7%
plain 62 1.7%
Other values (259) 2338 65.2%

Most occurring characters

Value Count Frequency (%)
A 2914 9.4%
T 2892 9.3%
I 2742 8.8%
(space) 2497 8.1%
S 2479 8.0%
E 2304 7.4%
N 2294 7.4%
O 1932 6.2%
R 1789 5.8%
C 1429 4.6%
Other values (28) 7735 24.9%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 31007 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc3_concept_code
Text

Distinct 176
Distinct (%) 16.2%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 4
Median length 4
Mean length 3.9779412
Min length 3

Characters and Unicode

Total characters 4328
Distinct characters 30
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 34
Unique (%) 3.1%

Sample

1st row C09D
2nd row R07A
3rd row A01A
4th row A01A
5th row A01A

Most occurring words

Value Count Frequency (%)
l01x 53 4.9%
j05a 39 3.6%
l04a 34 3.1%
n03a 26 2.4%
n06a 26 2.4%
nan 24 2.2%
v03a 23 2.1%
j01d 22 2.0%
b01a 21 1.9%
n05a 20 1.8%
Other values (166) 800 73.5%

Most occurring characters

Value Count Frequency (%)
0 1025 23.7%
A 670 15.5%
1 412 9.5%
B 253 5.8%
C 223 5.2%
N 174 4.0%
L 154 3.6%
3 154 3.6%
D 133 3.1%
5 122 2.8%
Other values (20) 1008 23.3%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 4328 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc2_concept_name
Text

High correlation 

Distinct 80
Distinct (%) 7.4%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 64
Median length 41
Mean length 24.675551
Min length 3

Characters and Unicode

Total characters 26847
Distinct characters 32
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6
Unique (%) 0.6%

Sample

1st row AGENTS ACTING ON THE RENIN-ANGIOTENSIN SYSTEM
2nd row OTHER RESPIRATORY SYSTEM PRODUCTS
3rd row STOMATOLOGICAL PREPARATIONS
4th row STOMATOLOGICAL PREPARATIONS
5th row STOMATOLOGICAL PREPARATIONS

Most occurring words

Value Count Frequency (%)
for 209 6.9%
agents 190 6.3%
use 148 4.9%
and 143 4.7%
systemic 125 4.1%
drugs 104 3.4%
antineoplastic 90 3.0%
other 76 2.5%
system 69 2.3%
products 68 2.3%
Other values (131) 1798 59.5%

Most occurring characters

Value Count Frequency (%)
S 2509 9.3%
T 2493 9.3%
A 2480 9.2%
I 2250 8.4%
E 2101 7.8%
(space) 1932 7.2%
N 1841 6.9%
O 1739 6.5%
R 1459 5.4%
C 1262 4.7%
Other values (22) 6781 25.3%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 26847 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc2_concept_code
Text

High correlation 

Distinct 80
Distinct (%) 7.4%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 3264
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6
Unique (%) 0.6%

Sample

1st row C09
2nd row R07
3rd row A01
4th row A01
5th row A01

Most occurring words

Value Count Frequency (%)
l01 90 8.3%
j01 53 4.9%
s01 44 4.0%
n05 42 3.9%
n06 39 3.6%
j05 39 3.6%
l04 34 3.1%
n02 27 2.5%
g03 26 2.4%
n03 26 2.4%
Other values (70) 668 61.4%

Most occurring characters

Value Count Frequency (%)
0 1025 31.4%
1 412 12.6%
N 174 5.3%
3 154 4.7%
L 152 4.7%
C 123 3.8%
A 123 3.8%
5 122 3.7%
J 113 3.5%
6 107 3.3%
Other values (16) 759 23.3%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 3264 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc1_concept_name
Categorical

High correlation 

Distinct 15
Distinct (%) 1.4%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB
NERVOUS SYSTEM 174
ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS 152
ALIMENTARY TRACT AND METABOLISM 123
CARDIOVASCULAR SYSTEM 123
ANTIINFECTIVES FOR SYSTEMIC USE 113
Other values (10) 403

Length

Max length 63
Median length 42
Mean length 26.429228
Min length 3

Characters and Unicode

Total characters 28755
Distinct characters 28
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row CARDIOVASCULAR SYSTEM
2nd row RESPIRATORY SYSTEM
3rd row ALIMENTARY TRACT AND METABOLISM
4th row ALIMENTARY TRACT AND METABOLISM
5th row ALIMENTARY TRACT AND METABOLISM

Common Values

Value Count Frequency (%)
NERVOUS SYSTEM 174 16.0%
ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS 152 14.0%
ALIMENTARY TRACT AND METABOLISM 123 11.3%
CARDIOVASCULAR SYSTEM 123 11.3%
ANTIINFECTIVES FOR SYSTEMIC USE 113 10.4%
DERMATOLOGICALS 66 6.1%
GENITO URINARY SYSTEM AND SEX HORMONES 51 4.7%
BLOOD AND BLOOD FORMING ORGANS 49 4.5%
RESPIRATORY SYSTEM 48 4.4%
SENSORY ORGANS 45 4.1%
Other values (5) 144 13.2%

Length

Histogram of lengths of the category

Most occurring words

Value Count Frequency (%)
system 440 12.8%
and 424 12.4%
nervous 174 5.1%
antineoplastic 152 4.4%
immunomodulating 152 4.4%
agents 152 4.4%
systemic 144 4.2%
tract 123 3.6%
metabolism 123 3.6%
cardiovascular 123 3.6%
Other values (25) 1419 41.4%

Most occurring characters

Value Count Frequency (%)
S 2878 10.0%
A 2549 8.9%
(space) 2338 8.1%
E 2261 7.9%
N 2257 7.8%
T 2240 7.8%
I 1953 6.8%
O 1950 6.8%
M 1681 5.8%
R 1535 5.3%
Other values (18) 7113 24.7%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 28755 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

atc1_concept_code
Categorical

High correlation 

Distinct 15
Distinct (%) 1.4%
Missing 0
Missing (%) 0.0%
Memory size 8.6 KiB
N 174
L 152
A 123
C 123
J 113
Other values (10) 403

Length

Max length 3
Median length 1
Mean length 1.0441176
Min length 1

Characters and Unicode

Total characters 1136
Distinct characters 16
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row C
2nd row R
3rd row A
4th row A
5th row A

Common Values

Value Count Frequency (%)
N 174 16.0%
L 152 14.0%
A 123 11.3%
C 123 11.3%
J 113 10.4%
D 66 6.1%
G 51 4.7%
B 49 4.5%
R 48 4.4%
S 45 4.1%
Other values (5) 144 13.2%

Length

Histogram of lengths of the category

Most occurring words

Value Count Frequency (%)
n 174 16.0%
l 152 14.0%
a 123 11.3%
c 123 11.3%
j 113 10.4%
d 66 6.1%
g 51 4.7%
b 49 4.5%
r 48 4.4%
s 45 4.1%
Other values (5) 144 13.2%

Most occurring characters

Value Count Frequency (%)
N 174 15.3%
L 152 13.4%
A 123 10.8%
C 123 10.8%
J 113 9.9%
D 66 5.8%
G 51 4.5%
B 49 4.3%
R 48 4.2%
n 48 4.2%
Other values (6) 189 16.6%

Most occurring categories, scripts, and blocks

Value Count Frequency (%)
(unknown) 1136 100.0%

Every character falls into a single (unknown) category, script, and block, so the per-category, per-script, and per-block tables repeat the "Most occurring characters" table.

Interactions

[Figures: four interaction plots]

Correlations

atc_concept_id ndrugreports
atc_concept_id 1.000 -0.081
ndrugreports -0.081 1.000

atc_concept_id ndrugreports
atc_concept_id 1.000 -0.040
ndrugreports -0.040 1.000

atc_concept_id ndrugreports
atc_concept_id 1.000 -0.025
ndrugreports -0.025 1.000

atc_concept_id ndrugreports atc2_concept_name atc2_concept_code atc1_concept_name atc1_concept_code
atc_concept_id 1.000 0.000 0.274 0.274 0.184 0.184
ndrugreports 0.000 1.000 0.075 0.075 0.151 0.151
atc2_concept_name 0.274 0.075 1.000 1.000 1.000 1.000
atc2_concept_code 0.274 0.075 1.000 1.000 1.000 1.000
atc1_concept_name 0.184 0.151 1.000 1.000 1.000 1.000
atc1_concept_code 0.184 0.151 1.000 1.000 1.000 1.000

atc1_concept_code atc1_concept_name
atc1_concept_code 1.000 1.000
atc1_concept_name 1.000 1.000
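
The 1.000 entries between code and name columns are what one would expect when two categorical columns map one-to-one. The report does not state here which association measure produced each matrix, so the sketch below uses plain (uncorrected) Cramér's V as an assumed stand-in, applied to the atc1 code and name values from the first five sample rows. The function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# atc1 code/name pairs from the first five sample rows of this report.
codes = ["C", "R", "A", "A", "A"]
names = [
    "CARDIOVASCULAR SYSTEM",
    "RESPIRATORY SYSTEM",
    "ALIMENTARY TRACT AND METABOLISM",
    "ALIMENTARY TRACT AND METABOLISM",
    "ALIMENTARY TRACT AND METABOLISM",
]
v_nested = cramers_v(codes, names)

# Hypothetical independent columns for contrast.
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(round(v_nested, 3), round(v_independent, 3))
```

A value of 1 here signals redundancy rather than insight: the name column is a relabelling of the code column, so one of each pair can usually be dropped before modelling.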

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | atc_concept_id | atc_concept_name | atc_concept_code | ndrugreports | atc4_concept_name | atc4_concept_code | atc3_concept_name | atc3_concept_code | atc2_concept_name | atc2_concept_code | atc1_concept_name | atc1_concept_code
0 | 1588648 | valsartan and sacubitril | C09DX04 | 1 | Angiotensin II receptor blockers (ARBs), other combinations | C09DX | ANGIOTENSIN II RECEPTOR BLOCKERS (ARBs), COMBINATIONS | C09D | AGENTS ACTING ON THE RENIN-ANGIOTENSIN SYSTEM | C09 | CARDIOVASCULAR SYSTEM | C
1 | 1588697 | ivacaftor and lumacaftor | R07AX30 | 1142 | Other respiratory system products | R07AX | OTHER RESPIRATORY SYSTEM PRODUCTS | R07A | OTHER RESPIRATORY SYSTEM PRODUCTS | R07 | RESPIRATORY SYSTEM | R
2 | 21600005 | sodium fluoride; oral | A01AA01 | 73 | Caries prophylactic agents | A01AA | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
3 | 21600008 | stannous fluoride; oral | A01AA04 | 66 | Caries prophylactic agents | A01AA | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
4 | 21600012 | hydrogen peroxide; oral | A01AB02 | 9 | Antiinfectives and antiseptics for local oral treatment | A01AB | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
5 | 21600013 | chlorhexidine; oral | A01AB03 | 91 | Antiinfectives and antiseptics for local oral treatment | A01AB | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
6 | 21600019 | miconazole; oral | A01AB09 | 51 | Antiinfectives and antiseptics for local oral treatment | A01AB | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
7 | 21600034 | triamcinolone; oral | A01AC01 | 168 | Corticosteroids for local oral treatment | A01AC | STOMATOLOGICAL PREPARATIONS | A01A | STOMATOLOGICAL PREPARATIONS | A01 | ALIMENTARY TRACT AND METABOLISM | A
8 | 21600056 | aluminium hydroxide | A02AB01 | 13 | Aluminium compounds | A02AB | ANTACIDS | A02A | DRUGS FOR ACID RELATED DISORDERS | A02 | ALIMENTARY TRACT AND METABOLISM | A
9 | 21600082 | cimetidine; systemic | A02BA01 | 84 | H2-receptor antagonists | A02BA | DRUGS FOR PEPTIC ULCER AND GASTRO-OESOPHAGEAL REFLUX DISEASE (GORD) | A02B | DRUGS FOR ACID RELATED DISORDERS | A02 | ALIMENTARY TRACT AND METABOLISM | A

Last rows

index | atc_concept_id | atc_concept_name | atc_concept_code | ndrugreports | atc4_concept_name | atc4_concept_code | atc3_concept_name | atc3_concept_code | atc2_concept_name | atc2_concept_code | atc1_concept_name | atc1_concept_code
1078 | 45893267 | metformin and canagliflozin | A10BD16 | 1 | Combinations of oral blood glucose lowering drugs | A10BD | BLOOD GLUCOSE LOWERING DRUGS, EXCL. INSULINS | A10B | DRUGS USED IN DIABETES | A10 | ALIMENTARY TRACT AND METABOLISM | A
1079 | 45893458 | darunavir and cobicistat; systemic | J05AR14 | 4 | Antivirals for treatment of HIV infections, combinations | J05AR | DIRECT ACTING ANTIVIRALS | J05A | ANTIVIRALS FOR SYSTEMIC USE | J05 | ANTIINFECTIVES FOR SYSTEMIC USE | J
1080 | 45893461 | ibrutinib | L01XE27 | 20 | Protein kinase inhibitors | L01XE | OTHER ANTINEOPLASTIC AGENTS | L01X | ANTINEOPLASTIC AGENTS | L01 | ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS | L
1081 | 45893463 | evolocumab; parenteral | C10AX13 | 36 | Other lipid modifying agents | C10AX | LIPID MODIFYING AGENTS, PLAIN | C10A | LIPID MODIFYING AGENTS | C10 | CARDIOVASCULAR SYSTEM | C
1082 | 45893464 | sofosbuvir | J05AX15 | 19 | nan | nan | nan | nan | nan | nan | nan | nan
1083 | 45893474 | riociguat; oral | C02KX05 | 77 | Antihypertensives for pulmonary arterial hypertension | C02KX | OTHER ANTIHYPERTENSIVES | C02K | ANTIHYPERTENSIVES | C02 | CARDIOVASCULAR SYSTEM | C
1084 | 45893476 | enzalutamide; oral | L02BB04 | 2 | Anti-androgens | L02BB | HORMONE ANTAGONISTS AND RELATED AGENTS | L02B | ENDOCRINE THERAPY | L02 | ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS | L
1085 | 45893488 | vedolizumab; parenteral | L04AA33 | 701 | Selective immunosuppressants | L04AA | IMMUNOSUPPRESSANTS | L04A | IMMUNOSUPPRESSANTS | L04 | ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS | L
1086 | 45893489 | olaparib | L01XX46 | 10 | Other antineoplastic agents | L01XX | OTHER ANTINEOPLASTIC AGENTS | L01X | ANTINEOPLASTIC AGENTS | L01 | ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS | L
1087 | 45893497 | cabozantinib | L01XE26 | 23 | Protein kinase inhibitors | L01XE | OTHER ANTINEOPLASTIC AGENTS | L01X | ANTINEOPLASTIC AGENTS | L01 | ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS | L
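
Row 1082 (sofosbuvir) shows `nan` in every hierarchy column, and `nan` appears 24 times among the atc4_concept_code values, yet the report counts 0 missing cells. This suggests the gaps are stored as the literal string "nan" rather than as true nulls. The sketch below counts such placeholders per column; the token set and the function name `count_placeholder_missing` are assumptions for illustration.

```python
# Assumed placeholder tokens that stand in for a missing value.
MISSING_TOKENS = {"nan", ""}

def count_placeholder_missing(rows, columns):
    """Count cells per column whose text is a missing-value placeholder."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if str(row[col]).strip().lower() in MISSING_TOKENS:
                counts[col] += 1
    return counts

# Two rows abridged from the last-rows sample (indices 1082 and 1080).
rows = [
    {"atc_concept_code": "J05AX15", "atc4_concept_code": "nan", "atc1_concept_code": "nan"},
    {"atc_concept_code": "L01XE27", "atc4_concept_code": "L01XE", "atc1_concept_code": "L"},
]
missing = count_placeholder_missing(
    rows, ["atc_concept_code", "atc4_concept_code", "atc1_concept_code"]
)
print(missing)
```

Converting these placeholders to real nulls before profiling would let the missing-value statistics and the Distinct counts reflect the actual gaps.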